AITopics | Nancy


Nancy

Unveiling Biases while Embracing Sustainability: Assessing the Dual Challenges of Automatic Speech Recognition Systems

Kulkarni, Ajinkya, Kulkarni, Atharva, Couceiro, Miguel, Trancoso, Isabel

arXiv.org Artificial Intelligence, Mar-2-2025

Ajinkya Kulkarni (1, 2), Atharva Kulkarni (3), Miguel Couceiro (4, 5), Isabel Trancoso (5). (1) IDIAP, Switzerland; (2) MBZUAI, UAE; (3) Erisha Labs, India; (4) Université de Lorraine, CNRS, LORIA, Nancy, France; (5) INESC-ID, IST, Universidade de Lisboa, Portugal. ajinkya.kulkarni@idiap.ch

Abstract: In this paper, we present a bias and sustainability focused investigation of Automatic Speech Recognition (ASR) systems, namely Whisper and Massively Multilingual Speech (MMS), which have achieved state-of-the-art (SOTA) performances. Despite their improved performance in controlled settings, there remains a critical gap in understanding their efficacy and equity in real-world scenarios. In addition, we examine the environmental impact of ASR systems, scrutinizing the use of large acoustic models on carbon emission and energy consumption. We also provide insights into our empirical analyses, offering a valuable contribution to the claims surrounding bias and sustainability in ASR systems.

Index Terms: ASR, Bias, carbon footprint, sustainability

1. Introduction: The advent of large deep neural networks (DNNs) has brought about substantial advancements in various speech-processing applications, notably in speech recognition.

artificial intelligence, energy consumption, machine learning, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.21437/Interspeech.2024-2494

2503.00907

Country:

Asia > India (0.34)
Europe > Portugal > Lisbon > Lisbon (0.24)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.24)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas (0.37)
Law > Environmental Law (0.35)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Inverse Reinforcement Learning through Structured Classification

Supélec - IMS-MaLIS Research Group, Nancy, France

Neural Information Processing Systems, Mar-14-2024, 08:02:21 GMT

This paper addresses the inverse reinforcement learning (IRL) problem, that is, inferring a reward for which a demonstrated expert behavior is optimal. We introduce a new algorithm, SCIRL, whose principle is to use the so-called feature expectation of the expert as the parameterization of the score function of a multiclass classifier. This approach produces a reward function for which the expert policy is provably near-optimal. Contrary to most existing IRL algorithms, SCIRL does not require solving the direct RL problem. Moreover, with an appropriate heuristic, it can succeed with only trajectories sampled according to the expert behavior. This is illustrated on a car driving simulator.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country: Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)


Parametric-Task MAP-Elites

Anne, Timothée, Mouret, Jean-Baptiste

arXiv.org Artificial Intelligence, Feb-2-2024

Optimizing a set of functions simultaneously by leveraging their similarity is called multi-task optimization. Current black-box multi-task algorithms only solve a finite set of tasks, even when the tasks originate from a continuous space. In this paper, we introduce Parametric-task MAP-Elites (PT-ME), a novel black-box algorithm to solve continuous multi-task optimization problems. This algorithm (1) solves a new task at each iteration, effectively covering the continuous space, and (2) exploits a new variation operator based on local linear regression. The resulting dataset of solutions makes it possible to create a function that maps any task parameter to its optimal solution. We show on two parametric-task toy problems and a more realistic and challenging robotic problem in simulation that PT-ME outperforms all baselines, including the deep reinforcement learning algorithm PPO.

evolutionary algorithm, machine learning, regression, (17 more...)

arXiv.org Artificial Intelligence

2402.01275

Country:

North America > United States > New York (0.14)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.14)

Genre: Research Report (0.84)

Industry:

Transportation (1.00)
Energy > Oil & Gas (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.89)


Cross Domain Early Crop Mapping using CropGAN and CNN Classifier

Wang, Yiqun, Huang, Hui, State, Radu

arXiv.org Artificial Intelligence, Jan-14-2024

Driven by abundant satellite imagery, machine learning-based approaches have recently been promoted to generate high-resolution crop cultivation maps to support many agricultural applications. One of the major challenges faced by these approaches is the limited availability of ground truth labels. In the absence of ground truth, existing work usually adopts the "direct transfer strategy" that trains a classifier using historical labels collected from other regions and then applies the trained model to the target region. Unfortunately, because the spectral features of crops exhibit inter-region and inter-annual variability due to changes in soil composition, climate conditions, and crop progress, the resultant models perform poorly on new and unseen regions or years. This paper presents the Crop Generative Adversarial Network (CropGAN) to address the above cross-domain issue. Our approach does not need labels from the target domain. Instead, it learns a mapping function to transform the spectral features of the target domain to the source domain (with labels) while preserving their local structure. The classifier trained on the source domain data can be directly applied to the transformed data to produce high-accuracy early crop maps of the target domain. Comprehensive experiments across various regions and years demonstrate the benefits and effectiveness of the proposed approach. Compared with the widely adopted direct transfer strategy, the F1 score after applying the proposed CropGAN is improved by 13.13% - 50.98%.

artificial intelligence, machine learning, target domain, (15 more...)

arXiv.org Artificial Intelligence

2401.07398

Country:

North America > United States > North Dakota (0.14)
North America > United States > Minnesota (0.14)
Oceania > Australia > New South Wales (0.14)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.14)

Genre: Research Report (1.00)

Industry:

Food & Agriculture > Agriculture (1.00)
Government > Regional Government > North America Government > United States Government (0.67)
Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.38)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Stability of Q-Learning Through Design and Optimism

Meyn, Sean

arXiv.org Artificial Intelligence, Aug-21-2023

Q-learning has become an important part of the reinforcement learning toolkit since its introduction in the dissertation of Chris Watkins in the 1980s. This paper is in part a tutorial on stochastic approximation and Q-learning, providing details regarding the INFORMS APS inaugural Applied Probability Trust Plenary Lecture, presented in Nancy, France, in June 2023. The paper also presents new approaches to ensure stability and potentially accelerated convergence for these algorithms, and for stochastic approximation in other settings. Two contributions are entirely new: 1. Stability of Q-learning with linear function approximation has been an open topic for research for over three decades. It is shown that with appropriate optimistic training in the form of a modified Gibbs policy, there exists a solution to the projected Bellman equation, and the algorithm is stable (in terms of bounded parameter estimates). Convergence remains one of many open topics for research. 2. The new Zap Zero algorithm is designed to approximate the Newton-Raphson flow without matrix inversion. It is stable and convergent under mild assumptions on the mean flow vector field for the algorithm, and compatible statistical assumptions on an underlying Markov chain. The algorithm is a general approach to stochastic approximation which in particular applies to Q-learning with "oblivious" training even with non-linear function approximation.
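For readers unfamiliar with the baseline algorithm this abstract builds on, the following is a minimal sketch of classical tabular Q-learning in the style of Watkins. It is not the paper's optimistic training scheme or the Zap Zero algorithm. The five-state chain environment, the function name `q_learning`, and all parameter values are assumptions made purely for illustration; the behavior policy is uniformly random, which is one simple example of a policy that does not depend on the current estimates.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical deterministic chain.

    Action 1 moves one state to the right, action 0 moves one state to the
    left (staying put at state 0). Entering the last state gives reward 1
    and ends the episode; every other transition gives reward 0.
    """
    rng = random.Random(seed)
    n_actions = 2
    q = [[0.0] * n_actions for _ in range(n_states)]
    goal = n_states - 1

    for _ in range(episodes):
        s = 0
        for _ in range(100):  # cap on episode length
            # Uniformly random behavior policy (Q-learning is off-policy).
            a = rng.randrange(n_actions)
            s_next = min(s + 1, goal) if a == 1 else max(s - 1, 0)
            done = s_next == goal
            r = 1.0 if done else 0.0
            # Temporal-difference target: reward plus discounted best next value.
            target = r if done else r + gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if done:
                break
    return q


q_table = q_learning()
greedy_policy = [max(range(2), key=lambda a: q_table[s][a]) for s in range(4)]
print(greedy_policy)
```

The table replaces the function approximation discussed in the abstract, so none of the stability questions raised there arise in this toy setting.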

artificial intelligence, machine learning, reinforcement learning, (21 more...)

arXiv.org Artificial Intelligence

2307.02632

Country:

North America > United States > Massachusetts (0.28)
North America > United States > New York (0.28)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.24)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.86)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.34)


Learnable Nonlinear Compression for Robust Speaker Verification

Liu, Xuechen, Sahidullah, Md, Kinnunen, Tomi

arXiv.org Artificial Intelligence, Feb-10-2022

In this study, we focus on nonlinear compression methods in spectral features for speaker verification based on deep neural networks. We consider different kinds of channel-dependent (CD) nonlinear compression methods optimized in a data-driven manner. Our methods are based on power nonlinearities and dynamic range compression (DRC). We also propose a multi-regime (MR) design for the nonlinearities, aimed at improving robustness. Results on VoxCeleb1 and VoxMovies data demonstrate improvements brought by the proposed compression methods over both the commonly used logarithm and their static counterparts, especially for the ones based on the power function. While CD generalization improves performance on VoxCeleb1, MR provides more robustness on VoxMovies, with a maximum relative equal error rate reduction of 21.6%.
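To make the contrast between the logarithm and a channel-dependent power nonlinearity concrete, here is a minimal sketch. It assumes the input is a list of frames of non-negative filterbank energies, and it uses fixed, hand-picked exponents; in the paper the compression parameters are optimized from data, which this sketch does not attempt. The function names and example values are hypothetical.

```python
import math


def log_compress(frames, eps=1e-10):
    """Static logarithmic compression, applied identically to every channel."""
    return [[math.log(e + eps) for e in frame] for frame in frames]


def power_compress(frames, exponents):
    """Channel-dependent power-law compression.

    Channel c of every frame is raised to its own exponent exponents[c].
    The exponents are fixed here; a learnable version would update them
    together with the rest of the network.
    """
    out = []
    for frame in frames:
        if len(frame) != len(exponents):
            raise ValueError("one exponent is required per channel")
        out.append([e ** a for e, a in zip(frame, exponents)])
    return out


# Two hypothetical frames with three frequency channels each.
frames = [[1.0, 4.0, 16.0], [0.25, 9.0, 1.0]]
print(power_compress(frames, [0.5, 0.5, 0.25]))
print(log_compress(frames))
```

Using a single shared exponent for all channels would correspond to a static power nonlinearity, while separate exponents per channel illustrate the channel-dependent idea.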

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

2202.05236

Country: Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.89)
Information Technology > Artificial Intelligence > Speech > Acoustic Processing (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)


Optimizing Multi-Taper Features for Deep Speaker Verification

Liu, Xuechen, Sahidullah, Md, Kinnunen, Tomi

arXiv.org Artificial Intelligence, Oct-21-2021

Multi-taper estimators provide low-variance power spectrum estimates that can be used in place of the windowed discrete Fourier transform (DFT) to extract speech features such as mel-frequency cepstral coefficients (MFCCs). Although past work has reported promising automatic speaker verification (ASV) results with Gaussian mixture model-based classifiers, the performance of multi-taper MFCCs with deep ASV systems remains an open question. Instead of a static-taper design, we propose to optimize the multi-taper estimator jointly with a deep neural network trained for ASV tasks. With a maximum improvement on the SITW corpus of 25.8% in terms of equal error rate over the static-taper design, our method helps preserve a balanced level of leakage and variance, providing more robustness.
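Several abstracts on this page report results as equal error rate (EER) or relative EER reduction. The sketch below shows one simple way these quantities can be computed, assuming lists of similarity scores for target (same-speaker) and non-target trials. The scores are made up for illustration, and evaluation toolkits typically interpolate along the error curve rather than sweeping observed thresholds as done here.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is at or above the threshold. The
    EER is taken where the miss rate and false-alarm rate are closest.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        miss = sum(s < t for s in target_scores) / len(target_scores)
        false_alarm = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(miss - false_alarm)
        if gap < best_gap:
            best_gap, eer = gap, (miss + false_alarm) / 2
    return eer


def relative_reduction(baseline, improved):
    """Relative reduction in percent, e.g. of the EER, against a baseline."""
    return 100.0 * (baseline - improved) / baseline


# Hypothetical trial scores.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.5, 0.3, 0.2]
print(equal_error_rate(targets, nontargets))
print(relative_reduction(4.0, 3.0))
```

A relative reduction compares two error rates as a fraction of the baseline, so it should not be read as an absolute drop in percentage points.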

artificial intelligence, machine learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/LSP.2021.3122796

2110.10983

Country:

North America > United States (0.28)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.73)
Information Technology > Artificial Intelligence > Speech > Acoustic Processing (0.73)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)


Optimized Power Normalized Cepstral Coefficients towards Robust Deep Speaker Verification

Liu, Xuechen, Sahidullah, Md, Kinnunen, Tomi

arXiv.org Artificial Intelligence, Sep-24-2021

After their introduction to robust speech recognition, power normalized cepstral coefficient (PNCC) features were successfully adopted for other tasks, including speaker verification. However, because PNCC extraction involves long-term operations on the power spectrogram, its temporal processing and amplitude scaling steps dedicated to environmental compensation may be redundant. Further, they might suppress intrinsic speaker variations that are useful for speaker verification based on deep neural networks (DNNs). Therefore, in this study, we revisit and optimize PNCCs by ablating their medium-time processor and by introducing channel energy normalization. Experimental results with a DNN-based speaker verification system indicate substantial improvement over baseline PNCCs in both in-domain and cross-domain scenarios, reflected by maximum relative equal error rate reductions of 5.8% and 61.2% on VoxCeleb1 and VoxMovies, respectively.

acoustic processing, normalization, speech recognition, (19 more...)

arXiv.org Artificial Intelligence

2109.12058

Country:

North America > United States (0.28)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.14)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Speech > Acoustic Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)


IDS unveils camera-based cap control with AI for beverage and bottle industry

#artificialintelligence, Apr-3-2021, 13:47:10 GMT

APREX Solutions from Nancy, France has successfully achieved this goal with the help of image processing technology and artificial intelligence.

artificial intelligence, beverage and bottle industry, id unveil camera-based cap control

#artificialintelligence

Country: Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.55)

Industry: Media > News (0.71)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence (1.00)


Using exoskeletons to assist medical staff during prone positioning of mechanically ventilated COVID-19 patients: a pilot study

Ivaldi, Serena, Maurice, Pauline, Gomes, Waldez, Theurel, Jean, Wioland, Liên, Atain-Kouadio, Jean-Jacques, Claudon, Laurent, Hani, Hind, Kimmoun, Antoine, Sellal, Jean-Marc, Levy, Bruno, Paysant, Jean, Malikov, Sergueï, Chenuel, Bruno, Settembre, Nicla

arXiv.org Artificial Intelligence, Feb-11-2021

We conducted a pilot study to evaluate the potential and feasibility of back-support exoskeletons to help the caregivers in the Intensive Care Unit (ICU) of the University Hospital of Nancy (France) execute Prone Positioning (PP) maneuvers on patients suffering from severe COVID-19-related Acute Respiratory Distress Syndrome. After comparing four commercial exoskeletons, the Laevo passive exoskeleton was selected and used in the ICU in April 2020. The first volunteers using the Laevo reported very positive feedback and a reduction of effort, confirmed by EMG and ECG analysis. The Laevo has since been used to physically assist during PP in the ICU of the Hospital of Nancy, following the recrudescence of COVID-19, with overall positive feedback.

exoskeleton, immunology, vascular disease, (18 more...)

arXiv.org Artificial Intelligence

2102.0876

Country: Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.26)

Genre:

Questionnaire & Opinion Survey (0.97)
Research Report > New Finding (0.93)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Assistive Technologies (1.00)
